Efficiency of using recombinant morphogenetic protein in patients with aggressive (rapidly progressing) generalized periodontitis
Abstract. The inclusion of drugs that significantly affect physiological remodeling and bone
regeneration in the complex therapy of generalized periodontitis is becoming increasingly
common.
Given the relevance of the search for drugs that restore periodontal bone structures, the
aim of our study was to increase the effectiveness of standard therapy for aggressive (rapidly
progressing) generalized periodontitis by adding the bone morphogenetic protein rhBMP-2 to
the generally accepted treatment complex.
We observed 61 patients diagnosed with aggressive (rapidly progressing) generalized
periodontitis, divided into a main group of 30 patients and a comparison group of 31 patients.
We used standard clinical, paraclinical, and laboratory research methods, supplemented by
dental volumetric tomography. In the main group, the standard treatment regimen (that of the
comparison group) was supplemented with the recombinant morphogenetic protein rhBMP-2.
A clinical examination conducted after 6-12 months revealed the absence of inflammatory
phenomena in the periodontium in 90% of patients in the main group, but in only 77.4% of
patients in the comparison group. Measurements of bone density on the Hounsfield scale (HU)
over the same period showed a 2-fold increase in the density of periodontal bone structures
relative to the comparison group. It follows that adding the osteoinductive drug rhBMP-2 to the
standard regimen of complex treatment for patients with rapidly progressing generalized
periodontitis allows for long-term clinical and radiological remission, and creates conditions
for the subsequent restoration of the density of periodontal bone structures.
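The abstract's key quantitative claim is the roughly two-fold Hounsfield-unit (HU) difference between groups. Below is a minimal sketch of how such a two-group comparison might be computed; the HU arrays and the choice of Welch's t-test are illustrative assumptions, not data or methods from the study.

```python
# Illustrative two-group bone-density comparison on the Hounsfield scale.
# The HU values are hypothetical placeholders; the study reports only the
# aggregate ~2-fold difference.
import numpy as np
from scipy import stats

main_hu = np.array([820, 790, 860, 805, 840], dtype=float)        # rhBMP-2 group (hypothetical)
comparison_hu = np.array([410, 395, 430, 400, 415], dtype=float)  # standard therapy (hypothetical)

ratio = main_hu.mean() / comparison_hu.mean()
t_stat, p_value = stats.ttest_ind(main_hu, comparison_hu, equal_var=False)  # Welch's t-test

print(f"mean HU ratio (main / comparison): {ratio:.2f}")
print(f"Welch t = {t_stat:.2f}, p = {p_value:.4f}")
```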
Analyzing Input and Output Representations for Speech-Driven Gesture Generation
This paper presents a novel framework for automatic speech-driven gesture
generation, applicable to human-agent interaction including both virtual agents
and robots. Specifically, we extend recent deep-learning-based, data-driven
methods for speech-driven gesture generation by incorporating representation
learning. Our model takes speech as input and produces gestures as output, in
the form of a sequence of 3D coordinates. Our approach consists of two steps.
First, we learn a lower-dimensional representation of human motion using a
denoising autoencoder neural network, consisting of a motion encoder MotionE
and a motion decoder MotionD. The learned representation preserves the most
important aspects of the human pose variation while removing less relevant
variation. Second, we train a novel encoder network SpeechE to map from speech
to a corresponding motion representation with reduced dimensionality. At test
time, the speech encoder and the motion decoder networks are combined: SpeechE
predicts motion representations based on a given speech signal and MotionD then
decodes these representations to produce motion sequences. We evaluate
different representation sizes in order to find the most effective
dimensionality for the representation. We also evaluate the effects of using
different speech features as input to the model. We find that mel-frequency
cepstral coefficients (MFCCs), alone or combined with prosodic features,
perform the best. The results of a subsequent user study confirm the benefits
of the representation learning.
Comment: Accepted at IVA '19. Shorter version published at AAMAS '19. The code is available at
https://github.com/GestureGeneration/Speech_driven_gesture_generation_with_autoencode
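As a concrete illustration of the two-step architecture described above, here is a minimal PyTorch sketch: a denoising autoencoder (MotionE/MotionD) over pose vectors, a speech encoder (SpeechE) trained to hit the same latent space, and the test-time composition SpeechE followed by MotionD. The layer sizes, per-frame treatment of the data, and noise level are illustrative assumptions; the paper's actual networks operate on motion sequences.

```python
# Minimal sketch of the two-step representation-learning pipeline.
# Dimensions and architectures are illustrative, not the paper's.
import torch
import torch.nn as nn

POSE_DIM, SPEECH_DIM, LATENT_DIM = 45, 26, 8  # e.g. 15 joints x 3D; MFCC-based speech features

class MotionE(nn.Module):                      # motion encoder
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(POSE_DIM, 128), nn.ReLU(), nn.Linear(128, LATENT_DIM))
    def forward(self, pose):
        return self.net(pose)

class MotionD(nn.Module):                      # motion decoder
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(LATENT_DIM, 128), nn.ReLU(), nn.Linear(128, POSE_DIM))
    def forward(self, z):
        return self.net(z)

class SpeechE(nn.Module):                      # speech-to-representation encoder
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(SPEECH_DIM, 128), nn.ReLU(), nn.Linear(128, LATENT_DIM))
    def forward(self, speech):
        return self.net(speech)

motion_e, motion_d, speech_e = MotionE(), MotionD(), SpeechE()

# Step 1: denoising-autoencoder objective on motion alone.
pose = torch.randn(32, POSE_DIM)                     # a batch of (hypothetical) pose frames
noisy = pose + 0.1 * torch.randn_like(pose)          # corrupt the input
recon_loss = nn.functional.mse_loss(motion_d(motion_e(noisy)), pose)

# Step 2: train SpeechE to predict the (frozen) motion representation.
speech = torch.randn(32, SPEECH_DIM)                 # aligned (hypothetical) speech features
with torch.no_grad():
    target_z = motion_e(pose)
speech_loss = nn.functional.mse_loss(speech_e(speech), target_z)

# Test time: speech in, 3D pose out.
generated_pose = motion_d(speech_e(speech))
```

Freezing MotionE when training SpeechE mirrors the paper's staging: the motion representation is fixed first, so the speech encoder only has to solve the simpler, lower-dimensional mapping problem.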
A Comprehensive Review of Data-Driven Co-Speech Gesture Generation
Gestures that accompany speech are an essential part of natural and efficient
embodied human communication. The automatic generation of such co-speech
gestures is a long-standing problem in computer animation and is considered an
enabling technology in film, games, virtual social spaces, and for interaction
with social robots. The problem is made challenging by the idiosyncratic and
non-periodic nature of human co-speech gesture motion, and by the great
diversity of communicative functions that gestures encompass. Gesture
generation has seen surging interest recently, owing to the emergence of more
and larger datasets of human gesture motion, combined with strides in
deep-learning-based generative models that benefit from the growing
availability of data. This review article summarizes co-speech gesture
generation research, with a particular focus on deep generative models. First,
we articulate the theory describing human gesticulation and how it complements
speech. Next, we briefly discuss rule-based and classical statistical gesture
synthesis, before delving into deep learning approaches. We employ the choice
of input modalities as an organizing principle, examining systems that generate
gestures from audio, text, and non-linguistic input. We also chronicle the
evolution of the related training data sets in terms of size, diversity, motion
quality, and collection method. Finally, we identify key research challenges in
gesture generation, including data availability and quality; producing
human-like motion; grounding the gesture in the co-occurring speech in
interaction with other speakers, and in the environment; performing gesture
evaluation; and integration of gesture synthesis into applications. We
highlight recent approaches to tackling the various key challenges, as well as
the limitations of these approaches, and point toward areas of future
development.
Comment: Accepted for EUROGRAPHICS 202
Can we trust online crowdworkers? Comparing online and offline participants in a preference test of virtual agents
Conducting user studies is a crucial component in many scientific fields.
While some studies require participants to be physically present, other studies
can be conducted both physically (e.g. in-lab) and online (e.g. via
crowdsourcing). Inviting participants to the lab can be a time-consuming and
logistically difficult endeavor, not to mention that sometimes research groups
might not be able to run in-lab experiments, because of, for example, a
pandemic. Crowdsourcing platforms such as Amazon Mechanical Turk (AMT) or
Prolific can therefore be a suitable alternative to run certain experiments,
such as evaluating virtual agents. Although previous studies investigated the
use of crowdsourcing platforms for running experiments, there is still
uncertainty as to whether the results are reliable for perceptual studies. Here
we replicate a previous experiment where participants evaluated a gesture
generation model for virtual agents. The experiment is conducted across three
participant pools -- in-lab, Prolific, and AMT -- with similar demographics
between the in-lab participants and those on the Prolific platform. Our results
show no difference between the three participant pools with regard to their
evaluations
of the gesture generation models and their reliability scores. The results
indicate that online platforms can successfully be used for perceptual
evaluations of this kind.
Comment: Accepted to IVA 2020. Patrik Jonell and Taras Kucherenko contributed equally to this work.
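The core analysis here is a comparison of rating distributions across three independent participant pools. Below is a minimal sketch of one way such a test could be run; the rating arrays are hypothetical and the choice of a Kruskal-Wallis test is an assumption, not necessarily the paper's procedure.

```python
# Sketch: testing whether three participant pools rate a gesture model
# differently. Ratings are hypothetical; the test choice is an assumption.
import numpy as np
from scipy import stats

in_lab   = np.array([4, 5, 3, 4, 4, 5, 3])   # hypothetical Likert-style ratings
prolific = np.array([4, 4, 3, 5, 4, 4, 3])
amt      = np.array([3, 4, 4, 4, 5, 3, 4])

h_stat, p_value = stats.kruskal(in_lab, prolific, amt)  # non-parametric one-way test
print(f"H = {h_stat:.2f}, p = {p_value:.3f}")           # large p: no detectable pool difference
```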
HEMVIP: Human Evaluation of Multiple Videos in Parallel
In many research areas, for example motion and gesture generation, objective
measures alone do not provide an accurate impression of key stimulus traits
such as perceived quality or appropriateness. The gold standard is instead to
evaluate these aspects through user studies, especially subjective evaluations
of video stimuli. Common evaluation paradigms either present individual stimuli
to be scored on Likert-type scales, or ask users to compare and rate videos in
a pairwise fashion. However, the time and resources required for such
evaluations scale poorly as the number of conditions to be compared increases.
Building on standards used for evaluating the quality of multimedia codecs,
this paper instead introduces a framework for granular rating of multiple
comparable videos in parallel. This methodology essentially analyses all
condition pairs at once. Our contributions are 1) a proposed framework, called
HEMVIP, for parallel and granular evaluation of multiple video stimuli and 2) a
validation study confirming that results obtained using the tool are in close
agreement with results of prior studies using conventional multiple pairwise
comparisons.
Comment: 8 pages, 2 figures.
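To make the "analyses all condition pairs at once" point concrete, the sketch below unrolls a single parallel rating pass over several videos into every pairwise preference, which is where the scaling advantage over one-pair-per-trial designs comes from. The data structures are illustrative assumptions, not HEMVIP's actual implementation.

```python
# Sketch: one parallel rating of N conditions yields all N*(N-1)/2
# pairwise outcomes, versus one outcome per trial in pairwise designs.
from itertools import combinations

# Hypothetical 0-100 slider ratings from a single parallel screen.
ratings = {"cond_A": 72, "cond_B": 55, "cond_C": 81, "cond_D": 40}

for a, b in combinations(ratings, 2):
    if ratings[a] == ratings[b]:
        outcome = "tie"
    else:
        outcome = f"{a if ratings[a] > ratings[b] else b} preferred"
    print(f"{a} vs {b}: {outcome}")
```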
Understanding the Predictability of Gesture Parameters from Speech and their Perceptual Importance
Gesture behavior is a natural part of human conversation. Much work has
focused on removing the need for tedious hand-animation to create embodied
conversational agents by designing speech-driven gesture generators. However,
these generators often work in a black-box manner, assuming a general
relationship between input speech and output motion. As their success remains
limited, we investigate in more detail how speech may relate to different
aspects of gesture motion. We determine a number of parameters characterizing
gesture, such as speed and gesture size, and explore their relationship to the
speech signal in a two-fold manner. First, we train multiple recurrent networks
to predict the gesture parameters from speech to understand how well gesture
attributes can be modeled from speech alone. We find that gesture parameters
can be partially predicted from speech, with some parameters, such as path
length, predicted more accurately than others, such as velocity. Second, we
design a perceptual study to assess the importance of each gesture parameter
for producing motion that people perceive as appropriate for the speech.
Results show that a degradation in any parameter was viewed negatively, but
some changes, such as hand shape, are more impactful than others. A video
summary can be found at https://youtu.be/aw6-_5kmLjY.
Comment: To be published in the Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents (IVA 20
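As an illustration of the kind of gesture parameters discussed, the sketch below computes speed and path length from a sequence of 3D hand positions. The exact parameter definitions in the paper may differ, so treat these formulas and the trajectory as assumptions.

```python
# Sketch: simple kinematic gesture parameters from a (hypothetical)
# T x 3 array of hand positions sampled at a fixed frame rate.
import numpy as np

FPS = 20.0
positions = np.cumsum(np.random.randn(100, 3) * 0.01, axis=0)  # hypothetical trajectory

step = np.diff(positions, axis=0)                 # per-frame displacement, shape (T-1, 3)
speed = np.linalg.norm(step, axis=1) * FPS        # instantaneous speed in units/s
path_length = np.linalg.norm(step, axis=1).sum()  # total distance travelled

print(f"path length: {path_length:.3f}, mean speed: {speed.mean():.3f}")
```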
The GENEA Challenge 2023: A large-scale evaluation of gesture generation models in monadic and dyadic settings
This paper reports on the GENEA Challenge 2023, in which participating teams
built speech-driven gesture-generation systems using the same speech and motion
dataset, followed by a joint evaluation. This year's challenge provided data on
both sides of a dyadic interaction, allowing teams to generate full-body motion
for an agent given its speech (text and audio) and the speech and motion of the
interlocutor. We evaluated 12 submissions and 2 baselines together with
held-out motion-capture data in several large-scale user studies. The studies
focused on three aspects: 1) the human-likeness of the motion, 2) the
appropriateness of the motion for the agent's own speech whilst controlling for
the human-likeness of the motion, and 3) the appropriateness of the motion for
the behaviour of the interlocutor in the interaction, using a setup that
controls for both the human-likeness of the motion and the agent's own speech.
We found a large span in human-likeness between challenge submissions, with a
few systems rated close to human mocap. Appropriateness seems far from being
solved, with most submissions performing in a narrow range slightly above
chance, far behind natural motion. The effect of the interlocutor is even more
subtle, with submitted systems at best performing barely above chance.
Interestingly, a dyadic system being highly appropriate for agent speech does
not necessarily imply high appropriateness for the interlocutor. Additional
material is available via the project website at
https://svito-zar.github.io/GENEAchallenge2023/.
Comment: The first three authors made equal contributions. Accepted for publication at the ACM International Conference on Multimodal Interaction (ICMI
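The phrase "slightly above chance" refers to an appropriateness paradigm in which raters choose between motion matched to the speech and mismatched motion while human-likeness is controlled, so chance performance is 50%. Below is a minimal sketch of scoring such responses; the response data are hypothetical and the scoring details are assumptions rather than the challenge's exact analysis.

```python
# Sketch: scoring a matched-vs-mismatched appropriateness study.
# Each trial records whether the rater picked the matched stimulus;
# the responses below are hypothetical.
import numpy as np
from scipy import stats

picked_matched = np.array([1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1])  # 1 = matched chosen

accuracy = picked_matched.mean()                    # chance level is 0.5
result = stats.binomtest(int(picked_matched.sum()), n=len(picked_matched), p=0.5)
print(f"matched chosen {accuracy:.0%} of trials, p = {result.pvalue:.3f} vs chance")
```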